When looking at the time course data, we noticed an interesting pattern: it seemed possible to predict an individual’s group from the difference between their load effect during encoding and their load effect during delay. Low capacity subjects appeared to have small load effects at both encoding and delay, medium capacity subjects had large load effects at both, and high capacity subjects had large load effects at encoding but small ones at delay. As such, we wanted to create a model that used this information to predict span.
To do so, we extracted data from TR 6 for the encoding period and TR 8 for the delay period. This represents one TR’s worth of data in the middle of where we’d expect activity for each period (based on our model of the task convolved with a hemodynamic response function) and where we see maximal differences between groups. In addition to the raw load effects, we created two composite variables that we expected to capture the differences between groups. The first was simply the difference between the load effects at encoding and delay, and the second was the sum of that difference and the load effect at encoding. These measures allow us to create a series of regression models to predict span, BPRS total score and accuracy at high load.
In addition to the univariate load effects, we have also seen a relationship with various measures reflecting multivariate representation across and within trials, including the probability of an MVPA classifier predicting a face at any given point in a trial and the similarity of multivariate representations across and within trials. Both of these measures were taken at each individual trial and averaged, in addition to being taken from the average over many trials (which served as a template for the canonical trial type). We added these multivariate measures to see whether they improved performance when added to regression models that only contained variables based on univariate load effects.
Overall, we were best able to explain variance in high load accuracy, with our best model explaining 40% of the variance. In contrast, we were only able to explain 12% of the variance in span and 7% of the variance in BPRS scores.
The individual components related to each of these measures will be discussed further below, but one interesting thing to note is that there is little overlap between them. The first PC of the similarity measures (which mostly reflected within-TR similarity on high load correct trials, regardless of area, and across-TR similarity at high load in the fusiform) was important for predicting both span and accuracy, while the encoding - delay difference in parietal regions was important for both accuracy and BPRS scores. There was no overlap between the components implicated in span and BPRS.
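The model comparisons summarized above aren't shown in this chunk. As a minimal sketch of how out-of-sample variance explained can be estimated with caret (which is loaded below), here is a 10-fold cross-validated linear model on simulated data; the variable names (`y`, `x1`, `x2`) are illustrative, not from the dataset.

```r
library(caret)

# Simulated stand-ins for an outcome (e.g. high load accuracy) and two predictors
set.seed(1)
d <- data.frame(x1 = rnorm(170), x2 = rnorm(170))
d$y <- d$x1 + 0.5 * d$x2 + rnorm(170)

# 10-fold cross-validation; fit$results$Rsquared is the CV estimate of R^2
ctrl <- trainControl(method = "cv", number = 10)
fit <- train(y ~ x1 + x2, data = d, method = "lm", trControl = ctrl)
fit$results$Rsquared
```

Comparing this CV R^2 across nested models (univariate load effects only vs. load effects plus multivariate measures) is one way to check whether added predictors genuinely improve prediction rather than overfit.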
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2 ✓ purrr 0.3.4
## ✓ tibble 3.0.3 ✓ dplyr 1.0.1
## ✓ tidyr 1.1.1 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(reshape2)
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
library(rmatio)
library(factoextra)
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(patchwork)
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:patchwork':
##
## area
## The following object is masked from 'package:dplyr':
##
## select
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(glmnet)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
## Loaded glmnet 4.0-2
load('data/behav.RData')
load('data/split_groups_info.RData')
load('data/ITC_fusiform.RData')
similarity_fusiform <- similarity_temp
load('data/ITC_DFR_delay.RData')
similarity_DFR <- similarity_temp
load("data/MVPA_fusiform.RData")
individual_trial_probs_fusiform <- individual_trial_averages_probs
averages_from_template_fusiform <- averages_from_template
load("data/MVPA_DFR_delay_mask.RData")
individual_trial_probs_DFR <- individual_trial_averages_probs
averages_from_template_DFR <- averages_from_template
load("data/MVPA_HPC.RData")
individual_trial_probs_HPC <- individual_trial_averages_probs
averages_from_template_HPC <- averages_from_template
DFR_ROIs <- c("L aMFG","L dlPFC","L dMFG","L IPS","L preSMA","R dlPFC","R IPS","R medParietal","all delay ROIs")
First, we’re going to load in the time courses of the activity from the DFR regions. We’re doing this a little differently from last time: we don’t want the interpolated data, just the version with 14 TRs.
We’re going to remove the subjects that aren’t included in the split group analysis, and put the data in a slightly easier-to-use format.
temp <- read.mat('data/RSA_DFR_trials.mat')
factors <- matrix(nrow=170)
for (idx in seq.int(1,170)){
if (constructs_fMRI$PTID[idx] %in% WM_groups[["high"]]$PTID){
factors[idx] <- "high"
}else if (constructs_fMRI$PTID[idx] %in% WM_groups[["med"]]$PTID){
factors[idx] <- "med"
}else if (constructs_fMRI$PTID[idx] %in% WM_groups[["low"]]$PTID){
factors[idx] <- "low"
}else{
factors[idx] <- "not_incl"
}
}
factors <- factor(factors, levels=c("low","med","high","not_incl"))
trial_activity_high <- temp[["trial_avg_activity_high"]]
trial_activity_low <- temp[["trial_avg_activity_low"]]
all_data <- list(subjs=constructs_fMRI$PTID,
WMC = factors,
L_aMFG = list(
high = data.frame(trial_activity_high[,,1,1]),
low = data.frame(trial_activity_low[,,1,1])
),
L_dlPFC = list(
high = data.frame(trial_activity_high[,,1,2]),
low = data.frame(trial_activity_low[,,1,2])
),
L_dMFG = list(
high = data.frame(trial_activity_high[,,1,3]),
low = data.frame(trial_activity_low[,,1,3])
),
L_IPS = list(
high = data.frame(trial_activity_high[,,1,4]),
low = data.frame(trial_activity_low[,,1,4])
),
L_preSMA = list(
high = data.frame(trial_activity_high[,,1,5]),
low = data.frame(trial_activity_low[,,1,5])
),
R_dlPFC = list(
high = data.frame(trial_activity_high[,,1,6]),
low = data.frame(trial_activity_low[,,1,6])
),
R_IPS = list(
high = data.frame(trial_activity_high[,,1,7]),
low = data.frame(trial_activity_low[,,1,7])
),
R_medParietal = list(
high = data.frame(trial_activity_high[,,1,8]),
low = data.frame(trial_activity_low[,,1,8])
),
all_delay = list(
high = data.frame(trial_activity_high[,,1,9]),
low = data.frame(trial_activity_low[,,1,9])
)
)
Now, we’re going to calculate load effects for each ROI at every time point.
for (ROI in seq.int(3,11)){
all_data[[ROI]][["load_effect"]] <- all_data[[ROI]][["high"]]-all_data[[ROI]][["low"]]
temp <- data.frame(group = all_data[["WMC"]], all_data[[ROI]][["load_effect"]])
all_data[[ROI]][["SVM"]] <- temp
}
Now, let’s pull out the encoding and delay values. We’re also going to calculate the difference between encoding and delay, and encoding + (encoding - delay), because we think these will be useful. We’re going to include all these measures, plus span, in a convenient dataframe.
for (ROI in seq.int(3,11)){
temp <- data.frame(matrix(nrow=170, ncol=6))
colnames(temp) <- c("group","encoding","delay","encoding_delay","encoding_delay_comb" ,"span")
temp$group <- all_data[["WMC"]]
temp$span <- constructs_fMRI$omnibus_span_no_DFR_MRI
temp$encoding <- all_data[[ROI]][["load_effect"]][,6]
#temp$delay <- rowMeans(all_data[[ROI]][["load_effect"]][,8:9])
temp$delay <- all_data[[ROI]][["load_effect"]][,8]
temp$encoding_delay <- temp$encoding - temp$delay
temp$encoding_delay_comb <- temp$encoding + temp$encoding_delay
temp$high_acc <- p200_data$XDFR_MRI_ACC_L3[p200_data$PTID %in% constructs_fMRI$PTID]
temp$BPRS <- p200_clinical_zscores$BPRS_TOT[p200_clinical_zscores$PTID %in% constructs_fMRI$PTID]
all_data[[ROI]][["SVM_2"]] <- temp
}
All regions except for L aMFG have significant relationships between this combined measure and span.
ROI_plot_list <- list()
for (ROI in seq.int(3,11)){
correlation_test <- cor.test(all_data[[ROI]][["SVM_2"]]$span, all_data[[ROI]][["SVM_2"]]$encoding_delay_comb)
ROI_plot_list[[DFR_ROIs[ROI-2]]] <- ggplot(data=all_data[[ROI]][["SVM_2"]])+
geom_point(aes(x=span,y=encoding_delay_comb,color=group))+
stat_smooth(aes(x=span,y=encoding_delay_comb),method="lm",color="black")+
ggtitle(paste(DFR_ROIs[ROI-2], ", r = ",round(correlation_test$estimate,digits=3)))+
xlab("WM Span")+
ylab("LE Difference")+
theme_classic()
}
(ROI_plot_list[[1]] + ROI_plot_list[[2]]) /
(ROI_plot_list[[3]] + ROI_plot_list[[4]]) +
plot_annotation(title = "Correlation between (encoding LE + encoding/delay LE difference) and span ")+
plot_layout(guides="collect")
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
(ROI_plot_list[[5]] + ROI_plot_list[[6]]) /
(ROI_plot_list[[7]] + ROI_plot_list[[8]]) +
plot_layout(guides="collect")
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
ROI_plot_list[[9]]
## `geom_smooth()` using formula 'y ~ x'
data_for_reg <- data.frame(span = all_data[["all_delay"]][["SVM_2"]]$span,
high_acc = all_data[["all_delay"]][["SVM_2"]]$high_acc, BPRS = all_data[["all_delay"]][["SVM_2"]]$BPRS,
L_aMFG_enc = all_data[[3]][["SVM_2"]]$encoding, L_aMFG_delay = all_data[[3]][["SVM_2"]]$delay,
L_dlPFC_enc = all_data[[4]][["SVM_2"]]$encoding, L_dlPFC_delay = all_data[[4]][["SVM_2"]]$delay,
L_dMFG_enc = all_data[[5]][["SVM_2"]]$encoding, L_dMFG_delay = all_data[[5]][["SVM_2"]]$delay,
L_IPS_enc = all_data[[6]][["SVM_2"]]$encoding, L_IPS_delay = all_data[[6]][["SVM_2"]]$delay,
L_preSMA_enc = all_data[[7]][["SVM_2"]]$encoding, L_preSMA_delay = all_data[[7]][["SVM_2"]]$delay,
R_dlPFC_enc = all_data[[8]][["SVM_2"]]$encoding, R_dlPFC_delay = all_data[[8]][["SVM_2"]]$delay,
R_IPS_enc = all_data[[9]][["SVM_2"]]$encoding, R_IPS_delay = all_data[[9]][["SVM_2"]]$delay,
R_medPar_enc = all_data[[10]][["SVM_2"]]$encoding, R_medPar_delay = all_data[[10]][["SVM_2"]]$delay,
all_delay_enc = all_data[[11]][["SVM_2"]]$encoding, all_delay_delay = all_data[[11]][["SVM_2"]]$delay,
L_aMFG_enc_delay = all_data[[3]][["SVM_2"]]$encoding_delay,
L_dlPFC_enc_delay = all_data[[4]][["SVM_2"]]$encoding_delay,
L_dMFG_enc_delay = all_data[[5]][["SVM_2"]]$encoding_delay,
L_IPS_enc_delay = all_data[[6]][["SVM_2"]]$encoding_delay,
L_preSMA_enc_delay = all_data[[7]][["SVM_2"]]$encoding_delay,
R_dlPFC_enc_delay = all_data[[8]][["SVM_2"]]$encoding_delay,
R_IPS_enc_delay = all_data[[9]][["SVM_2"]]$encoding_delay,
R_medPar_enc_delay = all_data[[10]][["SVM_2"]]$encoding_delay,
L_aMFG_enc_delay_comb = all_data[[3]][["SVM_2"]]$encoding_delay_comb,
L_dlPFC_enc_delay_comb = all_data[[4]][["SVM_2"]]$encoding_delay_comb,
L_dMFG_enc_delay_comb = all_data[[5]][["SVM_2"]]$encoding_delay_comb,
L_IPS_enc_delay_comb = all_data[[6]][["SVM_2"]]$encoding_delay_comb,
L_preSMA_enc_delay_comb = all_data[[7]][["SVM_2"]]$encoding_delay_comb,
R_dlPFC_enc_delay_comb = all_data[[8]][["SVM_2"]]$encoding_delay_comb,
R_IPS_enc_delay_comb = all_data[[9]][["SVM_2"]]$encoding_delay_comb,
R_medPar_enc_delay_comb = all_data[[10]][["SVM_2"]]$encoding_delay_comb,
R_all_delay_enc_delay_comb = all_data[[11]][["SVM_2"]]$encoding_delay_comb,
high_corr_fus_enc = similarity_fusiform[["high_correct_avg"]]$X6,
high_corr_fus_delay = similarity_fusiform[["high_correct_avg"]]$X8,
high_incorr_fus_enc = similarity_fusiform[["high_incorrect_avg"]]$X6,
high_incorr_fus_del = similarity_fusiform[["high_incorrect_avg"]]$X8,
low_corr_fus_enc = similarity_fusiform[["low_correct_avg"]]$X6,
low_corr_fus_del = similarity_fusiform[["low_correct_avg"]]$X8,
correct_encoding_to_correct_delay_fus = similarity_fusiform[["correct_encoding_to_correct_delay"]],
correct_enc_to_delay_fus_high_corr = similarity_fusiform[["correct_encoding_to_delay_avg"]][,4],
enc_to_correct_delay_fus_high_corr = similarity_fusiform[["encoding_to_correct_delay_avg"]][,4],
high_corr_DFR_enc = similarity_DFR[["high_correct_avg"]]$X6,
high_corr_DFR_delay = similarity_DFR[["high_correct_avg"]]$X8,
high_incorr_DFR_enc = similarity_DFR[["high_incorrect_avg"]]$X6,
high_incorr_DFR_del = similarity_DFR[["high_incorrect_avg"]]$X8,
low_corr_DFR_enc = similarity_DFR[["low_correct_avg"]]$X6,
low_corr_DFR_del = similarity_DFR[["low_correct_avg"]]$X8,
correct_encoding_to_correct_delay_DFR = similarity_DFR[["correct_encoding_to_correct_delay"]],
correct_enc_to_delay_DFR_high_corr = similarity_DFR[["correct_encoding_to_delay_avg"]][,4],
enc_to_correct_delay_DFR_high_corr = similarity_DFR[["encoding_to_correct_delay_avg"]][,4],
indiv_trial_avgs_high_correct_fusiform_enc = individual_trial_probs_fusiform[["high_correct"]]$V6,
indiv_trial_avgs_high_correct_fusiform_delay = individual_trial_probs_fusiform[["high_correct"]]$V8,
indiv_trial_avgs_high_correct_fusiform_probe = individual_trial_probs_fusiform[["high_correct"]]$V11,
indiv_trial_avgs_high_correct_DFR_enc = individual_trial_probs_DFR[["high_correct"]]$V6,
indiv_trial_avgs_high_correct_DFR_delay = individual_trial_probs_DFR[["high_correct"]]$V8,
indiv_trial_avgs_high_correct_DFR_probe = individual_trial_probs_DFR[["high_correct"]]$V11,
indiv_trial_avgs_high_incorrect_fusiform_enc = individual_trial_probs_fusiform[["high_incorrect"]]$V6,
indiv_trial_avgs_high_incorrect_fusiform_delay = individual_trial_probs_fusiform[["high_incorrect"]]$V8,
indiv_trial_avgs_high_incorrect_fusiform_probe = individual_trial_probs_fusiform[["high_incorrect"]]$V11,
indiv_trial_avgs_high_incorrect_DFR_enc = individual_trial_probs_DFR[["high_incorrect"]]$V6,
indiv_trial_avgs_high_incorrect_DFR_delay = individual_trial_probs_DFR[["high_incorrect"]]$V8,
indiv_trial_avgs_high_incorrect_DFR_probe = individual_trial_probs_DFR[["high_incorrect"]]$V11,
indiv_trial_avgs_low_correct_fusiform_enc = individual_trial_probs_fusiform[["low_correct"]]$V6,
indiv_trial_avgs_low_correct_fusiform_delay = individual_trial_probs_fusiform[["low_correct"]]$V8,
indiv_trial_avgs_low_correct_fusiform_probe = individual_trial_probs_fusiform[["low_correct"]]$V11,
indiv_trial_avgs_low_correct_DFR_enc = individual_trial_probs_DFR[["low_correct"]]$V6,
indiv_trial_avgs_low_correct_DFR_delay = individual_trial_probs_DFR[["low_correct"]]$V8,
indiv_trial_avgs_low_correct_DFR_probe = individual_trial_probs_DFR[["low_correct"]]$V11,
averages_from_template_high_correct_fusiform_enc = averages_from_template_fusiform[["high_correct"]]$V6,
averages_from_template_high_correct_fusiform_delay = averages_from_template_fusiform[["high_correct"]]$V8,
averages_from_template_high_correct_fusiform_probe = averages_from_template_fusiform[["high_correct"]]$V11,
averages_from_template_high_correct_DFR_enc = averages_from_template_DFR[["high_correct"]]$V6,
averages_from_template_high_correct_DFR_delay = averages_from_template_DFR[["high_correct"]]$V8,
averages_from_template_high_correct_DFR_probe = averages_from_template_DFR[["high_correct"]]$V11,
averages_from_template_high_incorrect_fusiform_enc = averages_from_template_fusiform[["high_incorrect"]]$V6,
averages_from_template_high_incorrect_fusiform_delay = averages_from_template_fusiform[["high_incorrect"]]$V8,
averages_from_template_high_incorrect_fusiform_probe = averages_from_template_fusiform[["high_incorrect"]]$V11,
averages_from_template_high_incorrect_DFR_enc = averages_from_template_DFR[["high_incorrect"]]$V6,
averages_from_template_high_incorrect_DFR_delay = averages_from_template_DFR[["high_incorrect"]]$V8,
averages_from_template_high_incorrect_DFR_probe = averages_from_template_DFR[["high_incorrect"]]$V11,
averages_from_template_low_correct_fusiform_enc = averages_from_template_fusiform[["low_correct"]]$V6,
averages_from_template_low_correct_fusiform_delay = averages_from_template_fusiform[["low_correct"]]$V8,
averages_from_template_low_correct_fusiform_probe = averages_from_template_fusiform[["low_correct"]]$V11,
averages_from_template_low_correct_DFR_enc = averages_from_template_DFR[["low_correct"]]$V6,
averages_from_template_low_correct_DFR_delay = averages_from_template_DFR[["low_correct"]]$V8,
averages_from_template_low_correct_DFR_probe = averages_from_template_DFR[["low_correct"]]$V11
)
From the plots below, we can see that our measures are highly correlated.
pairs.panels(data_for_reg[,c(4,6,8,10,12,14,16,18)], density=TRUE)
pairs.panels(data_for_reg[,c(5,7,9,11,13,15,17,19)], density=TRUE)
pairs.panels(data_for_reg[,c(22:29)], density= TRUE)
pairs.panels(data_for_reg[,c(39:44,49:53)])
pairs.panels(data_for_reg[,c(45:47,53:56)])
pairs.panels(data_for_reg[,c(57:59,63:65,69:71)])
pairs.panels(data_for_reg[,c(60:62,66:68,72:74)])
pairs.panels(data_for_reg[,c(75:77,81:83,87:89)])
pairs.panels(data_for_reg[,c(78:80,84:86,90:92)])
We know that our variables are highly correlated, so to deal with this multicollinearity we’re going to run a PCA on all the encoding load effects, delay load effects, and encoding - delay differences. It looks like the first 3 dimensions carry most of the variance (~78%), so we’re going to stick with those three.
The first component loads on the encoding load effects and the combined encoding + (encoding - delay) measures, the second on the delay load effects, and the third on the encoding - delay differences in parietal regions.
res.pca <- prcomp(data_for_reg[,c(4:19, 22:37)], scale = TRUE)
fviz_eig(res.pca)
summary(res.pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.7908 2.811 1.65063 1.09084 1.00491 0.93056 0.80212
## Proportion of Variance 0.4491 0.247 0.08514 0.03719 0.03156 0.02706 0.02011
## Cumulative Proportion 0.4491 0.696 0.78119 0.81837 0.84993 0.87699 0.89710
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.76707 0.71693 0.68174 0.64478 0.58292 0.55500 0.53242
## Proportion of Variance 0.01839 0.01606 0.01452 0.01299 0.01062 0.00963 0.00886
## Cumulative Proportion 0.91549 0.93155 0.94607 0.95906 0.96968 0.97931 0.98817
## PC15 PC16 PC17 PC18 PC19 PC20
## Standard deviation 0.47888 0.38646 7.17e-16 3.563e-16 3.271e-16 3.249e-16
## Proportion of Variance 0.00717 0.00467 0.00e+00 0.000e+00 0.000e+00 0.000e+00
## Cumulative Proportion 0.99533 1.00000 1.00e+00 1.000e+00 1.000e+00 1.000e+00
## PC21 PC22 PC23 PC24 PC25
## Standard deviation 3.249e-16 3.249e-16 3.249e-16 3.249e-16 3.249e-16
## Proportion of Variance 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00
## PC26 PC27 PC28 PC29 PC30
## Standard deviation 3.249e-16 3.249e-16 3.249e-16 3.249e-16 3.249e-16
## Proportion of Variance 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00
## PC31 PC32
## Standard deviation 3.249e-16 8.156e-17
## Proportion of Variance 0.000e+00 0.000e+00
## Cumulative Proportion 1.000e+00 1.000e+00
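Having settled on the first 3 components, the next step is to regress an outcome on their scores. As a hedged sketch on simulated data (the real load-effect matrix isn't reproduced here; `X` and `span_sim` are illustrative stand-ins), principal-component regression looks like this:

```r
# Sketch: principal-component regression using the first 3 PC scores.
# X stands in for the 170 x 32 matrix of load-effect measures; span_sim for span.
set.seed(1)
X <- matrix(rnorm(170 * 32), nrow = 170)
span_sim <- X[, 1] - X[, 2] + rnorm(170)

pca <- prcomp(X, scale = TRUE)
scores <- as.data.frame(pca$x[, 1:3])   # first 3 component scores per subject
fit <- lm(span_sim ~ PC1 + PC2 + PC3, data = scores)
summary(fit)$r.squared                  # variance explained by the 3 PCs
```

Because the PC scores are orthogonal by construction, the coefficients can be interpreted without the multicollinearity that plagues the raw load-effect predictors.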
res.var <- get_pca_var(res.pca)
res.var$contrib # Contributions to the PCs
## Dim.1 Dim.2 Dim.3 Dim.4
## L_aMFG_enc 5.3891319 0.47119180 0.888115784 0.36395876
## L_aMFG_delay 0.8226601 8.30932520 0.161685785 5.30343379
## L_dlPFC_enc 5.1548713 0.19826396 2.373438821 2.75026006
## L_dlPFC_delay 0.8379575 8.01469715 0.117445309 4.15914356
## L_dMFG_enc 3.8686045 0.85706847 1.164896007 6.16741622
## L_dMFG_delay 0.1022154 7.46291800 0.008000602 9.16696186
## L_IPS_enc 4.7377580 1.39325395 1.224394885 0.39584075
## L_IPS_delay 1.0231861 8.19702653 0.338261276 0.03533589
## L_preSMA_enc 5.1446938 0.32011044 2.217134905 3.17955097
## L_preSMA_delay 0.8476296 7.65183437 0.109230087 2.04246883
## R_dlPFC_enc 4.8411481 0.58400281 0.660233258 8.57086183
## R_dlPFC_delay 0.4665321 8.49680253 0.001860368 3.88823718
## R_IPS_enc 4.6816446 1.10080762 4.315200235 2.04909637
## R_IPS_delay 1.3096340 7.49613390 0.465936108 6.41418179
## R_medPar_enc 3.2808053 1.10075964 7.876768303 6.19113578
## R_medPar_delay 0.9113219 5.91997922 0.227520498 16.91789118
## L_aMFG_enc_delay 2.5324146 4.73490205 2.032870916 2.81926059
## L_dlPFC_enc_delay 2.7789692 4.99259214 1.960353495 0.01134452
## L_dMFG_enc_delay 2.6097636 3.62950352 0.951155098 0.44867111
## L_IPS_enc_delay 2.7587888 3.17762904 4.472413792 1.07537730
## L_preSMA_enc_delay 2.9225616 3.56323490 3.901315764 0.49062485
## R_dlPFC_enc_delay 3.2093707 3.99485392 0.750404939 1.80412108
## R_IPS_enc_delay 1.8786624 3.44107339 11.369746524 1.30541423
## R_medPar_enc_delay 1.1590509 2.19063291 14.875619645 2.74212080
## L_aMFG_enc_delay_comb 5.0769737 0.62074833 1.804032417 0.31835510
## L_dlPFC_enc_delay_comb 4.9534742 0.83706460 2.729483644 0.85938824
## L_dMFG_enc_delay_comb 4.3869148 0.33860905 1.443745630 1.10438546
## L_IPS_enc_delay_comb 4.9935247 0.01495033 3.124513783 0.85155346
## L_preSMA_enc_delay_comb 4.9097908 0.42407936 3.603717963 1.98007920
## R_dlPFC_enc_delay_comb 5.1030170 0.36967175 0.889223751 6.01662206
## R_IPS_enc_delay_comb 4.3856493 0.08036282 9.428010721 0.10822018
## R_medPar_enc_delay_comb 2.9212793 0.01591632 14.513269687 0.46868698
## Dim.5 Dim.6 Dim.7 Dim.8
## L_aMFG_enc 2.191788e-03 0.20964029 7.712767e+00 0.03824134
## L_aMFG_delay 9.687208e-02 0.12110441 2.321384e-02 4.55847334
## L_dlPFC_enc 9.108968e-02 2.60273811 4.210358e+00 2.42948142
## L_dlPFC_delay 1.339513e-01 1.30724088 1.770957e-01 4.50186461
## L_dMFG_enc 7.905999e+00 12.72284008 2.654925e+00 9.12659161
## L_dMFG_delay 6.012391e-01 0.17156884 3.215105e+00 43.05370597
## L_IPS_enc 8.176725e+00 1.68875826 1.766065e+00 0.23846660
## L_IPS_delay 9.968879e-01 0.51285308 2.272607e+00 0.05392658
## L_preSMA_enc 1.261368e-01 2.62051097 2.434397e+00 0.30489165
## L_preSMA_delay 2.111877e+00 1.35803414 1.751194e-01 3.64893694
## R_dlPFC_enc 4.696895e-01 1.64840315 9.644213e+00 0.04769402
## R_dlPFC_delay 7.404954e-01 0.06356039 4.579126e-01 2.20293291
## R_IPS_enc 1.568018e+00 0.21383279 2.051265e-02 0.06899940
## R_IPS_delay 1.502662e-05 0.31090846 1.174455e+00 0.03354544
## R_medPar_enc 8.147835e+00 0.14723808 8.788306e-01 2.49026583
## R_medPar_delay 2.969242e-01 8.09328107 2.103845e-01 1.66990002
## L_aMFG_enc_delay 1.337204e-01 0.02035708 9.962873e+00 3.80214280
## L_dlPFC_enc_delay 4.798263e-01 0.50183655 3.607073e+00 0.08887450
## L_dMFG_enc_delay 1.283097e+01 9.61391157 5.748055e-02 14.34202288
## L_IPS_enc_delay 6.579791e+00 6.36066813 3.115424e-04 0.13433045
## L_preSMA_enc_delay 8.585519e-01 0.57214691 4.560442e+00 1.25899066
## R_dlPFC_enc_delay 5.780333e-03 2.82995347 7.894993e+00 1.46069449
## R_IPS_enc_delay 2.412406e+00 1.48548791 2.042080e+00 0.28961034
## R_medPar_enc_delay 7.673975e+00 13.23109526 3.561902e-01 0.17864213
## L_aMFG_enc_delay_comb 5.231659e-02 0.12274324 1.145020e+01 0.91244969
## L_dlPFC_enc_delay_comb 2.963849e-01 1.76958663 4.922183e+00 0.58801344
## L_dMFG_enc_delay_comb 1.400786e+01 15.20150770 6.512064e-01 0.22069299
## L_IPS_enc_delay_comb 9.772634e+00 4.38907468 7.248432e-01 0.24829542
## L_preSMA_enc_delay_comb 7.528156e-02 1.79605150 4.094281e+00 0.06804776
## R_dlPFC_enc_delay_comb 1.344995e-01 2.74422258 1.114123e+01 0.26258824
## R_IPS_enc_delay_comb 2.572048e+00 0.85541583 6.865057e-01 0.19881970
## R_medPar_enc_delay_comb 1.064800e+01 4.71342794 8.201517e-01 1.47786682
## Dim.9 Dim.10 Dim.11 Dim.12
## L_aMFG_enc 13.278170174 0.109731528 4.033356e-01 0.42801950
## L_aMFG_delay 1.342858718 0.917286020 1.994876e+00 1.93776465
## L_dlPFC_enc 0.878516318 4.317530491 7.051428e+00 1.45697474
## L_dlPFC_delay 1.268545237 0.189190762 2.162303e+00 2.38867922
## L_dMFG_enc 0.146003756 0.065136254 1.029398e+00 0.38021257
## L_dMFG_delay 0.046964789 0.437974138 1.770550e-03 0.57839925
## L_IPS_enc 0.013463451 4.682815175 1.505287e+00 4.31087651
## L_IPS_delay 1.172436421 18.403683386 8.471263e+00 0.13718635
## L_preSMA_enc 5.133685342 6.624477714 4.407162e-01 0.48161426
## L_preSMA_delay 5.289776045 9.108170935 6.100037e+00 0.23304554
## R_dlPFC_enc 0.144442372 7.774814960 3.882862e-02 0.40397005
## R_dlPFC_delay 4.251145239 14.739325156 5.736845e+00 9.68615414
## R_IPS_enc 1.793446114 0.420869858 6.020588e+00 7.20554551
## R_IPS_delay 1.606333951 2.110097765 7.432278e+00 1.54425894
## R_medPar_enc 0.035308393 0.960331963 8.150868e+00 1.41203351
## R_medPar_delay 0.066299785 0.891710489 1.293636e+01 2.40534491
## L_aMFG_enc_delay 7.651060649 1.762566526 5.523870e-01 4.47906410
## L_dlPFC_enc_delay 0.001208272 7.590465302 2.444434e+00 8.11881280
## L_dMFG_enc_delay 0.362105252 0.185686047 1.096361e+00 0.03075567
## L_IPS_enc_delay 1.219408960 4.649779406 3.157977e+00 5.16920385
## L_preSMA_enc_delay 21.575960240 0.019513572 2.271140e+00 1.48951803
## R_dlPFC_enc_delay 2.538278841 0.409947298 6.577741e+00 14.11435216
## R_IPS_enc_delay 0.041211220 0.756275304 8.914974e-03 3.62824139
## R_medPar_enc_delay 0.260708314 0.010770587 4.145118e-01 0.10776837
## L_aMFG_enc_delay_comb 13.572919213 0.857864744 9.702658e-04 2.40692837
## L_dlPFC_enc_delay_comb 0.288758625 7.212139669 5.756511e+00 4.92392973
## L_dMFG_enc_delay_comb 0.332049066 0.011016627 1.454776e+00 0.06533131
## L_IPS_enc_delay_comb 0.224594341 0.107951579 7.993100e-03 6.12184543
## L_preSMA_enc_delay_comb 13.999429621 2.489407071 1.566848e-01 1.08323273
## R_dlPFC_enc_delay_comb 0.385329145 1.732547397 2.189785e+00 5.63480254
## R_IPS_enc_delay_comb 0.928287991 0.001042925 2.325207e+00 7.30853075
## R_medPar_enc_delay_comb 0.151294148 0.449879351 2.108423e+00 0.32760312
## Dim.13 Dim.14 Dim.15 Dim.16
## L_aMFG_enc 2.055304420 1.301848254 11.122356297 4.714024e-01
## L_aMFG_delay 10.229182560 0.172424578 28.663347755 3.203287e+00
## L_dlPFC_enc 3.348577186 0.352525524 4.559006342 4.803500e-01
## L_dlPFC_delay 24.838072816 12.859716846 5.179478325 2.557820e+00
## L_dMFG_enc 0.159879157 1.056206127 0.056890684 1.199356e-01
## L_dMFG_delay 0.042679075 0.077706875 0.008190717 3.792668e-03
## L_IPS_enc 0.698135805 0.045158330 2.074201114 7.537944e+00
## L_IPS_delay 0.928100936 0.294634121 0.414719789 2.286884e+01
## L_preSMA_enc 0.430217645 11.910671123 0.002018598 1.615081e-04
## L_preSMA_delay 0.037398610 31.019799712 2.329721060 9.499498e-01
## R_dlPFC_enc 1.276603225 3.381070858 0.283093903 2.854704e+00
## R_dlPFC_delay 4.705479940 5.596719026 0.316058002 8.685663e+00
## R_IPS_enc 0.536009590 0.004190656 0.109314453 1.169153e+01
## R_IPS_delay 4.922707021 0.313529270 5.846535005 2.425481e+01
## R_medPar_enc 0.367270582 1.665193314 0.030146981 1.046965e-01
## R_medPar_delay 4.961217757 9.065157070 0.774502953 1.337381e-01
## L_aMFG_enc_delay 2.848848821 0.661086897 3.292189228 1.143972e+00
## L_dlPFC_enc_delay 7.606770498 7.822773151 0.040342374 5.869520e-01
## L_dMFG_enc_delay 0.033197376 0.531090587 0.020302180 1.649701e-01
## L_IPS_enc_delay 4.860458134 0.124357162 6.872486750 3.880322e+00
## L_preSMA_enc_delay 0.314523092 1.444614239 1.800915875 8.162426e-01
## R_dlPFC_enc_delay 0.739542733 0.067461955 1.311378135 9.912395e-01
## R_IPS_enc_delay 12.052059523 0.320088162 10.253686599 2.047910e+00
## R_medPar_enc_delay 3.149255630 3.388709658 1.416772407 6.277596e-01
## L_aMFG_enc_delay_comb 0.005755984 1.264365038 0.968108786 3.235184e-02
## L_dlPFC_enc_delay_comb 0.137089243 1.262646001 1.882061558 4.476829e-05
## L_dMFG_enc_delay_comb 0.115462146 1.053822341 0.049539608 1.939339e-01
## L_IPS_enc_delay_comb 2.679451418 0.001427464 4.997653816 6.362974e-01
## L_preSMA_enc_delay_comb 0.456080563 1.940086974 0.451459509 2.279751e-01
## R_dlPFC_enc_delay_comb 0.045918522 0.910781278 0.854790771 2.253636e-01
## R_IPS_enc_delay_comb 5.104207601 0.062578579 3.470203187 2.117068e+00
## R_medPar_enc_delay_comb 0.314542392 0.027558829 0.548527238 3.889646e-01
## Dim.17 Dim.18 Dim.19 Dim.20
## L_aMFG_enc 0.000000000 0.00000000 0.00000000 0.00000000
## L_aMFG_delay 0.006101393 0.12110579 0.33680427 0.07045756
## L_dlPFC_enc 1.043325227 6.93374374 7.42820073 2.68422171
## L_dlPFC_delay 0.272418316 6.05502573 2.83978943 3.99269389
## L_dMFG_enc 7.105352254 0.87351929 0.58105734 1.57192438
## L_dMFG_delay 10.149786307 2.32240612 1.61853821 2.21295991
## L_IPS_enc 1.956502374 1.73081613 0.62900719 2.00492042
## L_IPS_delay 0.287554752 3.94494665 0.38563134 4.31671601
## L_preSMA_enc 1.242245842 3.09874780 0.59041769 2.07214541
## L_preSMA_delay 0.084453627 2.74456742 1.34708257 5.42118126
## R_dlPFC_enc 5.371738369 6.98657500 2.38553148 12.00019383
## R_dlPFC_delay 0.147391522 0.16898178 2.26243513 2.50893744
## R_IPS_enc 0.255242007 0.39368903 10.74327827 3.15055996
## R_IPS_delay 1.120680510 0.15652832 5.60520297 3.39359000
## R_medPar_enc 0.588011120 0.01982035 11.58793138 1.53307154
## R_medPar_delay 1.087232077 2.95184434 14.70134269 0.25795260
## L_aMFG_enc_delay 0.023816427 0.47272928 1.31469552 0.27502693
## L_dlPFC_enc_delay 3.958528177 7.56967335 1.13110754 7.17713513
## L_dMFG_enc_delay 12.081560782 4.02733433 2.86794014 2.60435856
## L_IPS_enc_delay 3.998033370 5.83630208 0.21351394 6.23291404
## L_preSMA_enc_delay 0.125670500 4.26732828 3.46262360 14.62037209
## R_dlPFC_enc_delay 1.660245095 2.32304156 2.86288232 0.01909799
## R_IPS_enc_delay 2.039783425 0.03281356 2.15343954 3.11577874
## R_medPar_enc_delay 6.307229425 10.17581203 15.83021363 0.01899968
## L_aMFG_enc_delay_comb 0.019722563 0.39147069 1.08870929 0.22775188
## L_dlPFC_enc_delay_comb 7.545424953 0.14948429 1.66570997 1.33892074
## L_dMFG_enc_delay_comb 0.440955303 0.81504527 0.61553377 0.08668369
## L_IPS_enc_delay_comb 9.556747667 1.99731423 0.02189456 2.00772081
## L_preSMA_enc_delay_comb 1.673581079 0.23057699 1.27997024 5.97289640
## R_dlPFC_enc_delay_comb 10.005880585 13.38772728 0.08541686 7.75929190
## R_IPS_enc_delay_comb 0.982610807 0.09829240 1.28207882 0.10762238
## R_medPar_enc_delay_comb 8.862174146 9.72273689 1.08201955 1.24390311
## Dim.21 Dim.22 Dim.23 Dim.24
## L_aMFG_enc 0.000000e+00 0.000000000 0.00000000 0.000000000
## L_aMFG_delay 8.830110e-01 0.499768730 0.87570406 0.418486909
## L_dlPFC_enc 9.998353e+00 1.905523449 3.91610205 2.702886785
## L_dlPFC_delay 1.780092e-01 0.287023160 1.43484365 1.032513193
## L_dMFG_enc 3.508705e+00 7.212429879 0.30603235 12.353085766
## L_dMFG_delay 9.909390e-02 5.729591056 0.72408473 8.065953919
## L_IPS_enc 5.029401e-01 0.936744494 8.10147045 0.170651311
## L_IPS_delay 5.521610e-01 2.533975817 10.31846484 0.240014369
## L_preSMA_enc 2.936076e-01 0.327458269 7.84272780 0.010832678
## L_preSMA_delay 1.961293e-01 1.984324152 3.61405493 0.738938902
## R_dlPFC_enc 5.266699e+00 0.102882697 0.60871441 1.773200300
## R_dlPFC_delay 3.398146e+00 1.262053326 4.02785249 0.310020067
## R_IPS_enc 1.145978e-02 0.785134323 2.46820943 17.654254759
## R_IPS_delay 1.047153e-02 0.522953292 1.31995469 7.525592219
## R_medPar_enc 5.881066e+00 0.003177578 5.38783020 5.594064753
## R_medPar_delay 7.008768e+00 1.118243417 1.02591193 0.074318895
## L_aMFG_enc_delay 3.446781e+00 1.950817641 3.41825893 1.633538864
## L_dlPFC_enc_delay 3.765138e+00 0.014217121 0.51691043 0.410534319
## L_dMFG_enc_delay 6.266682e+00 3.691071289 1.18566695 3.793452261
## L_IPS_enc_delay 5.433797e-01 12.177061508 11.25690564 1.351342680
## L_preSMA_enc_delay 2.389280e-01 6.651479831 2.80939479 3.908917567
## R_dlPFC_enc_delay 2.938004e+00 4.034473732 11.58052090 0.002840769
## R_IPS_enc_delay 6.936537e-02 3.862718456 0.52969677 1.892676945
## R_medPar_enc_delay 7.204436e+00 3.755120473 0.02244739 2.280664920
## L_aMFG_enc_delay_comb 2.854305e+00 1.615486823 2.83068604 1.352745872
## L_dlPFC_enc_delay_comb 2.029953e+01 1.631105169 0.97527312 0.607296260
## L_dMFG_enc_delay_comb 1.397177e+01 0.455299292 0.20272113 1.862647450
## L_IPS_enc_delay_comb 3.743739e-02 18.122506922 1.38161220 2.214723872
## L_preSMA_enc_delay_comb 4.183294e-05 3.957436104 0.61407565 3.956492963
## R_dlPFC_enc_delay_comb 1.014874e-01 2.624941133 6.48029822 1.371038572
## R_IPS_enc_delay_comb 1.148533e-01 6.744414096 0.26939917 3.770037190
## R_medPar_enc_delay_comb 3.592415e-01 3.500566770 3.95417465 10.926234674
## Dim.25 Dim.26 Dim.27 Dim.28
## L_aMFG_enc 0.000000000 0.00000000 0.000000e+00 0.000000000
## L_aMFG_delay 0.860378677 0.83754188 2.674349e+00 3.818779784
## L_dlPFC_enc 0.099465869 1.98848308 1.585473e+01 0.307890978
## L_dlPFC_delay 0.667530363 4.25919874 5.346068e+00 0.200855111
## L_dMFG_enc 2.271743797 1.76288806 7.731710e-01 1.863149755
## L_dMFG_delay 1.115902849 0.92095597 2.014670e-01 0.242259560
## L_IPS_enc 5.472462450 6.41528253 2.301879e+00 27.359997656
## L_IPS_delay 2.814187909 0.25066743 9.000545e-01 6.242389536
## L_preSMA_enc 5.255830127 7.82405659 1.232492e+01 1.626161528
## L_preSMA_delay 2.102870177 1.74661137 4.990691e+00 0.764225166
## R_dlPFC_enc 13.985520547 7.67062879 2.718750e-01 0.001341535
## R_dlPFC_delay 10.046214629 2.33879585 2.656597e-01 1.195551182
## R_IPS_enc 0.205983584 2.89154378 1.775683e+00 0.209755633
## R_IPS_delay 0.009637319 0.04252043 2.235876e+00 0.343547635
## R_medPar_enc 0.002047625 5.83057377 4.680444e+00 0.049776954
## R_medPar_delay 0.044025623 0.75889286 5.556915e-01 0.335642640
## L_aMFG_enc_delay 3.358437211 3.26929515 1.043916e+01 14.906380729
## L_dlPFC_enc_delay 1.990655774 9.10805659 1.546617e+00 0.188437264
## L_dMFG_enc_delay 12.742955843 0.26675284 3.823322e-04 0.181965445
## L_IPS_enc_delay 1.203527103 1.17041270 2.220554e-01 0.085435869
## L_preSMA_enc_delay 1.298315409 0.16706649 3.147525e+00 0.611457377
## R_dlPFC_enc_delay 9.806998680 0.41955178 2.308963e+00 4.846281996
## R_IPS_enc_delay 0.037926215 2.97320400 2.309658e+00 0.419192598
## R_medPar_enc_delay 0.170792070 0.23243546 2.446574e-01 0.717659665
## L_aMFG_enc_delay_comb 2.781147221 2.70732800 8.644748e+00 12.344086471
## L_dlPFC_enc_delay_comb 1.147237780 2.79152991 4.750068e+00 0.003200887
## L_dMFG_enc_delay_comb 18.748525968 0.49279174 5.964350e-01 2.358335515
## L_IPS_enc_delay_comb 0.495395826 9.26969728 4.873806e-01 14.038046019
## L_preSMA_enc_delay_comb 0.748446793 3.99523439 1.681383e+00 0.114804981
## R_dlPFC_enc_delay_comb 0.040257299 2.96359488 3.492246e+00 4.164688776
## R_IPS_enc_delay_comb 0.294561186 8.98878587 1.892305e-01 0.073363271
## R_medPar_enc_delay_comb 0.181018076 5.64562180 4.786925e+00 0.385338484
## Dim.29 Dim.30 Dim.31 Dim.32
## L_aMFG_enc 0.000000000 0.000000000 0.00000000 5.575459e+01
## L_aMFG_delay 0.607323965 0.191066512 0.09030963 1.985102e+01
## L_dlPFC_enc 1.135842969 0.052151798 1.69366312 1.509929e-29
## L_dlPFC_delay 0.002972700 2.682941297 0.05491458 1.733337e-29
## L_dMFG_enc 1.733690222 2.814161857 7.78708563 2.773339e-30
## L_dMFG_delay 0.003176588 0.860495937 0.75413546 5.825938e-31
## L_IPS_enc 1.478894744 0.242733888 0.20655392 7.703720e-32
## L_IPS_delay 0.560265866 0.007395033 0.52462694 5.565938e-30
## L_preSMA_enc 0.585337737 0.888859195 14.64566253 7.703720e-30
## L_preSMA_delay 0.777995715 0.035051910 0.43879297 4.814825e-31
## R_dlPFC_enc 0.103381419 0.555458598 0.29248486 2.773339e-30
## R_dlPFC_delay 1.703900563 0.246202225 0.08213504 7.703720e-32
## R_IPS_enc 10.855374407 4.349466908 2.45075607 1.733337e-31
## R_IPS_delay 10.917544655 1.377782684 0.18275360 6.933348e-31
## R_medPar_enc 9.246251938 5.666422598 1.09002318 7.703720e-32
## R_medPar_delay 3.819277247 0.608500478 0.17082048 4.814825e-31
## L_aMFG_enc_delay 2.370653129 0.745816814 0.35251829 3.592357e+00
## L_dlPFC_enc_delay 1.126393241 10.163483803 2.70048407 1.733337e-31
## L_dMFG_enc_delay 2.074326159 0.010160947 1.29743931 7.703720e-32
## L_IPS_enc_delay 0.127546844 0.277513508 0.81075161 9.437057e-31
## L_preSMA_enc_delay 6.834700072 0.183471999 3.81245257 5.565938e-30
## R_dlPFC_enc_delay 5.662655284 2.823948419 0.01038084 3.081488e-31
## R_IPS_enc_delay 9.466287105 13.767595270 3.99699618 4.814825e-31
## R_medPar_enc_delay 0.856045009 0.379466243 0.02043730 1.925930e-30
## L_aMFG_enc_delay_comb 1.963155762 0.617616537 0.29192306 2.080203e+01
## L_dlPFC_enc_delay_comb 3.627385745 8.000892670 7.04818039 2.773339e-30
## L_dMFG_enc_delay_comb 5.550419715 1.838702280 11.33328261 7.703720e-32
## L_IPS_enc_delay_comb 0.336711678 0.828796839 0.30740963 2.773339e-30
## L_preSMA_enc_delay_comb 10.027796420 1.482951552 26.50862600 0.000000e+00
## R_dlPFC_enc_delay_comb 3.875633673 4.878008064 0.12909986 9.321501e-30
## R_IPS_enc_delay_comb 0.172381653 27.297347958 9.98286280 3.081488e-31
## R_medPar_enc_delay_comb 2.396677777 6.125536180 0.93243749 1.925930e-30
# Contributions of variables to PC1
fviz_contrib(res.pca, choice = "var", axes = 1, top = 15)
# Contributions of variables to PC2
fviz_contrib(res.pca, choice = "var", axes = 2, top = 10)
# Contributions of variables to PC3
fviz_contrib(res.pca, choice = "var", axes = 3, top = 10)
fviz_pca_var(res.pca,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
We’re going to do the same thing for the similarity measures, since they have the same issue (though not quite as severe).
The first five PCs cover about 81% of the variance (per the summary below), so we’ll focus on those.
PC 1: mostly related to within-TR similarity on high load correct trials (regardless of area) and across-TR similarity at high load in the fusiform
PC 2: within-TR similarity on low load correct trials in DFR, and individual-to-template across-TR similarity on high load correct trials in the fusiform
PC 3: fusiform similarity during encoding (regardless of accuracy), and delay similarity in DFR
PC 4: how similar high load incorrect trials are to the template correct trials, and low load correct trials in the fusiform
PC 5: mostly related to correct trials (regardless of load) in the fusiform
res_sim.pca <- prcomp(data_for_reg[,c(31:48)], scale = TRUE)
fviz_eig(res_sim.pca)
summary(res_sim.pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.7137 1.9660 1.17157 1.0314 0.96550 0.78670 0.69463
## Proportion of Variance 0.4091 0.2147 0.07625 0.0591 0.05179 0.03438 0.02681
## Cumulative Proportion 0.4091 0.6239 0.70011 0.7592 0.81100 0.84539 0.87219
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.63943 0.58355 0.57136 0.52755 0.50901 0.47112 0.38700
## Proportion of Variance 0.02271 0.01892 0.01814 0.01546 0.01439 0.01233 0.00832
## Cumulative Proportion 0.89491 0.91383 0.93196 0.94742 0.96182 0.97415 0.98247
## PC15 PC16 PC17 PC18
## Standard deviation 0.35572 0.32075 0.27761 0.09529
## Proportion of Variance 0.00703 0.00572 0.00428 0.00050
## Cumulative Proportion 0.98950 0.99521 0.99950 1.00000
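The cutoff used above (“focus on the first five PCs”) can also be read off programmatically from the `prcomp` object. A small helper sketch, not part of the original analysis, using the `res_sim.pca` object fitted above:

```r
# Number of PCs needed to reach a cumulative-variance threshold.
# Works on any prcomp object; the variance of each PC is sdev^2.
n_pcs_for <- function(pca, threshold = 0.80) {
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(cum_var >= threshold)[1]
}
n_pcs_for(res_sim.pca)  # 5, matching the cumulative proportion of 0.811 at PC5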
res_sim.var <- get_pca_var(res_sim.pca)
res_sim.var$contrib # Contributions to the PCs
## Dim.1 Dim.2 Dim.3 Dim.4
## L_dlPFC_enc_delay_comb 6.918623 5.480412 1.41447237 11.00358811
## L_dMFG_enc_delay_comb 6.273713 4.350938 1.17159880 6.33267954
## L_IPS_enc_delay_comb 6.096469 7.967627 0.20029076 3.96465303
## L_preSMA_enc_delay_comb 6.686211 4.873785 1.79377045 11.67198387
## R_dlPFC_enc_delay_comb 7.629073 4.646966 0.45667515 4.69679365
## R_IPS_enc_delay_comb 5.996474 7.056570 2.29757771 10.99823360
## R_medPar_enc_delay_comb 5.265801 2.642132 8.25964094 21.87120937
## R_all_delay_enc_delay_comb 9.256676 7.798231 0.07571221 0.09364537
## high_corr_fus_enc 6.094995 3.852201 10.18190861 9.39817483
## high_corr_fus_delay 4.609297 7.348209 0.06477430 1.78084606
## high_incorr_fus_enc 3.463601 4.748518 14.89518607 2.96188398
## high_incorr_fus_del 2.264186 4.221773 1.27383154 0.64588340
## low_corr_fus_enc 2.843233 5.440746 25.08951936 0.01029436
## low_corr_fus_del 4.715895 8.017147 0.21085651 0.77541801
## correct_encoding_to_correct_delay_fus 4.365219 6.293588 7.56091565 3.49542573
## correct_enc_to_delay_fus_high_corr 4.583100 7.206993 16.06765633 2.75853442
## enc_to_correct_delay_fus_high_corr 6.189247 5.881304 8.21931577 0.04772051
## high_corr_DFR_enc 6.748187 2.172860 0.76629748 7.49303217
## Dim.5 Dim.6 Dim.7
## L_dlPFC_enc_delay_comb 0.2085356 0.01386202 1.493229e+00
## L_dMFG_enc_delay_comb 1.3480550 0.01548475 1.291436e-01
## L_IPS_enc_delay_comb 0.1599291 0.18277789 9.965150e+00
## L_preSMA_enc_delay_comb 0.5923387 0.59812404 1.260117e+00
## R_dlPFC_enc_delay_comb 0.1851407 0.01036687 6.962444e-01
## R_IPS_enc_delay_comb 1.6078286 0.12547225 1.044868e+01
## R_medPar_enc_delay_comb 0.6583775 0.41550657 4.592828e+00
## R_all_delay_enc_delay_comb 0.1197864 0.02926260 3.841576e-01
## high_corr_fus_enc 2.5215321 0.15023831 3.868029e-01
## high_corr_fus_delay 13.6974956 18.03835425 9.455616e-01
## high_incorr_fus_enc 4.8843943 14.65712365 1.190280e+01
## high_incorr_fus_del 33.2262992 49.69695979 5.979452e-01
## low_corr_fus_enc 0.1392294 4.75584405 1.405305e-04
## low_corr_fus_del 9.7206117 4.75235522 1.184072e+01
## correct_encoding_to_correct_delay_fus 19.0945668 3.40601534 7.845392e-02
## correct_enc_to_delay_fus_high_corr 1.3384667 2.84378164 1.680355e-01
## enc_to_correct_delay_fus_high_corr 10.3215390 0.24836452 1.772019e-01
## high_corr_DFR_enc 0.1758735 0.06010625 4.493279e+01
## Dim.8 Dim.9 Dim.10
## L_dlPFC_enc_delay_comb 0.015206735 7.61223592 0.008173188
## L_dMFG_enc_delay_comb 31.875125302 24.87865430 3.920321457
## L_IPS_enc_delay_comb 2.412054492 0.46043228 20.409576126
## L_preSMA_enc_delay_comb 0.010421251 0.57932753 3.887309020
## R_dlPFC_enc_delay_comb 10.787609307 2.18526879 10.153171840
## R_IPS_enc_delay_comb 0.722548962 0.03790630 0.279153975
## R_medPar_enc_delay_comb 1.537794489 9.97326675 5.219164756
## R_all_delay_enc_delay_comb 0.268227072 0.02520042 0.116667059
## high_corr_fus_enc 0.301924251 0.06851873 2.832955843
## high_corr_fus_delay 8.235283821 0.08069533 8.985985154
## high_incorr_fus_enc 14.812247708 2.91581725 0.971354020
## high_incorr_fus_del 0.990291315 4.20528366 0.188923359
## low_corr_fus_enc 22.727916855 25.52308636 1.112727274
## low_corr_fus_del 0.137643174 7.03599510 23.429391268
## correct_encoding_to_correct_delay_fus 2.181387037 0.52908400 0.330916519
## correct_enc_to_delay_fus_high_corr 0.007863228 1.69728478 13.272578018
## enc_to_correct_delay_fus_high_corr 1.636492932 0.39220878 2.776128948
## high_corr_DFR_enc 1.339962068 11.79973371 2.105502177
## Dim.11 Dim.12 Dim.13
## L_dlPFC_enc_delay_comb 8.715112e+00 2.50041427 3.952299127
## L_dMFG_enc_delay_comb 3.814625e-04 14.61932504 2.318927967
## L_IPS_enc_delay_comb 1.561125e+00 10.35457704 1.253503103
## L_preSMA_enc_delay_comb 2.588207e+00 1.80760849 50.028876059
## R_dlPFC_enc_delay_comb 1.540546e+01 6.31893924 12.835560242
## R_IPS_enc_delay_comb 2.662028e+00 2.34880888 0.180467154
## R_medPar_enc_delay_comb 1.516174e-04 19.77140590 5.189198892
## R_all_delay_enc_delay_comb 4.718356e-01 0.20762222 0.224947693
## high_corr_fus_enc 2.695366e+01 0.09269949 6.391792945
## high_corr_fus_delay 7.333003e-04 4.76761656 3.450004335
## high_incorr_fus_enc 8.016745e+00 11.53408090 0.099059686
## high_incorr_fus_del 1.508422e+00 0.25970123 0.546755116
## low_corr_fus_enc 1.682253e+00 0.14915870 1.950984418
## low_corr_fus_del 6.851174e-02 7.16266163 10.403707568
## correct_encoding_to_correct_delay_fus 9.089736e+00 1.76409454 0.001123374
## correct_enc_to_delay_fus_high_corr 1.870442e-01 2.60453552 0.818908730
## enc_to_correct_delay_fus_high_corr 1.403616e+01 0.03868600 0.051565892
## high_corr_DFR_enc 7.052436e+00 13.69806437 0.302317699
## Dim.14 Dim.15 Dim.16
## L_dlPFC_enc_delay_comb 31.64783824 9.943012465 5.86598254
## L_dMFG_enc_delay_comb 0.01202585 1.423926342 0.23574437
## L_IPS_enc_delay_comb 3.89871366 2.221367488 22.57880242
## L_preSMA_enc_delay_comb 9.08884435 0.087014431 3.57194324
## R_dlPFC_enc_delay_comb 7.07047136 9.385721690 0.62795798
## R_IPS_enc_delay_comb 4.97227312 4.501286550 34.59705431
## R_medPar_enc_delay_comb 0.04555327 0.752927547 11.12864503
## R_all_delay_enc_delay_comb 0.03993065 0.027691649 0.12268747
## high_corr_fus_enc 16.03962835 0.968417567 1.12880975
## high_corr_fus_delay 4.67411473 8.129302510 2.05730364
## high_incorr_fus_enc 2.27008353 1.579542644 0.24153868
## high_incorr_fus_del 0.19117346 0.005941106 0.16712811
## low_corr_fus_enc 7.63250731 0.002761028 0.31269098
## low_corr_fus_del 0.01340532 1.191104536 7.92739732
## correct_encoding_to_correct_delay_fus 3.50505208 36.202696286 2.02547441
## correct_enc_to_delay_fus_high_corr 0.98120302 17.034477631 0.07474867
## enc_to_correct_delay_fus_high_corr 7.20963604 6.240517906 7.30077644
## high_corr_DFR_enc 0.70754565 0.302290623 0.03531465
## Dim.17 Dim.18
## L_dlPFC_enc_delay_comb 1.662969465 1.544035e+00
## L_dMFG_enc_delay_comb 0.069093333 1.024862e+00
## L_IPS_enc_delay_comb 4.384493279 1.928458e+00
## L_preSMA_enc_delay_comb 0.027247776 8.468707e-01
## R_dlPFC_enc_delay_comb 1.482921821 5.425659e+00
## R_IPS_enc_delay_comb 4.841005631 6.326628e+00
## R_medPar_enc_delay_comb 0.619741276 2.056655e+00
## R_all_delay_enc_delay_comb 0.364783590 8.037294e+01
## high_corr_fus_enc 12.501638130 1.341019e-01
## high_corr_fus_delay 13.133682085 7.407353e-04
## high_incorr_fus_enc 0.044845972 1.179011e-03
## high_incorr_fus_del 0.005674004 3.828901e-03
## low_corr_fus_enc 0.624108664 2.798553e-03
## low_corr_fus_del 2.595246884 1.935584e-03
## correct_encoding_to_correct_delay_fus 0.050035684 2.621572e-02
## correct_enc_to_delay_fus_high_corr 28.311900467 4.288751e-02
## enc_to_correct_delay_fus_high_corr 29.015591819 2.175448e-01
## high_corr_DFR_enc 0.265020119 4.266461e-02
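The PC interpretations above are read off this contribution matrix; a small helper sketch (hypothetical, not part of the original analysis) pulls the top-k contributors for any PC, complementing the `fviz_contrib` plots:

```r
# Top-k contributing variables for a given PC, from the contribution matrix
# returned by get_pca_var() (rows = variables, columns = dimensions,
# values = % contribution to that dimension).
top_contrib <- function(var_res, pc, k = 5) {
  sort(var_res$contrib[, pc], decreasing = TRUE)[seq_len(k)]
}
top_contrib(res_sim.var, pc = 1)  # largest contributors to PC1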
# Contributions of variables to PC1
fviz_contrib(res_sim.pca, choice = "var", axes = 1, top = 10)
# Contributions of variables to PC2
fviz_contrib(res_sim.pca, choice = "var", axes = 2, top = 10)
# Contributions of variables to PC3
fviz_contrib(res_sim.pca, choice = "var", axes = 3, top = 10)
# Contributions of variables to PC4
fviz_contrib(res_sim.pca, choice = "var", axes = 4, top = 10)
# Contributions of variables to PC5
fviz_contrib(res_sim.pca, choice = "var", axes = 5, top = 10)
fviz_pca_var(res_sim.pca,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
Now, let’s do the same for the MVPA measures.
res_MVPA.pca <- prcomp(data_for_reg[,c(57:92)], scale = TRUE)
fviz_eig(res_MVPA.pca)
summary(res_MVPA.pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.1733 1.9754 1.88085 1.69900 1.61812 1.50461 1.17208
## Proportion of Variance 0.1312 0.1084 0.09827 0.08018 0.07273 0.06289 0.03816
## Cumulative Proportion 0.1312 0.2396 0.33785 0.41804 0.49077 0.55365 0.59181
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 1.14686 1.12981 0.9859 0.97778 0.91642 0.87732 0.85316
## Proportion of Variance 0.03654 0.03546 0.0270 0.02656 0.02333 0.02138 0.02022
## Cumulative Proportion 0.62835 0.66381 0.6908 0.71736 0.74069 0.76207 0.78229
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 0.81320 0.8027 0.77550 0.75619 0.73849 0.70641 0.70461
## Proportion of Variance 0.01837 0.0179 0.01671 0.01588 0.01515 0.01386 0.01379
## Cumulative Proportion 0.80066 0.8186 0.83526 0.85115 0.86630 0.88016 0.89395
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 0.67451 0.64717 0.63189 0.59873 0.57657 0.53769 0.52688
## Proportion of Variance 0.01264 0.01163 0.01109 0.00996 0.00923 0.00803 0.00771
## Cumulative Proportion 0.90659 0.91822 0.92931 0.93927 0.94850 0.95654 0.96425
## PC29 PC30 PC31 PC32 PC33 PC34 PC35
## Standard deviation 0.51823 0.44246 0.4155 0.40917 0.38482 0.36243 0.33344
## Proportion of Variance 0.00746 0.00544 0.0048 0.00465 0.00411 0.00365 0.00309
## Cumulative Proportion 0.97171 0.97714 0.9819 0.98659 0.99070 0.99435 0.99744
## PC36
## Standard deviation 0.30350
## Proportion of Variance 0.00256
## Cumulative Proportion 1.00000
res_MVPA.var <- get_pca_var(res_MVPA.pca)
#res_MVPA.var$contrib # Contributions to the PCs
# Contributions of variables to PC1
fviz_contrib(res_MVPA.pca, choice = "var", axes = 1, top = 10)
# Contributions of variables to PC2
fviz_contrib(res_MVPA.pca, choice = "var", axes = 2, top = 10)
# Contributions of variables to PC3
fviz_contrib(res_MVPA.pca, choice = "var", axes = 3, top = 10)
# Contributions of variables to PC4
fviz_contrib(res_MVPA.pca, choice = "var", axes = 4, top = 10)
# Contributions of variables to PC5
fviz_contrib(res_MVPA.pca, choice = "var", axes = 5, top = 10)
# Contributions of variables to PC6
fviz_contrib(res_MVPA.pca, choice = "var", axes = 6, top = 10)
# Contributions of variables to PC7
fviz_contrib(res_MVPA.pca, choice = "var", axes = 7, top = 10)
# Contributions of variables to PC8
fviz_contrib(res_MVPA.pca, choice = "var", axes = 8, top = 10)
# Contributions of variables to PC9
fviz_contrib(res_MVPA.pca, choice = "var", axes = 9, top = 10)
# Contributions of variables to PC10
fviz_contrib(res_MVPA.pca, choice = "var", axes = 10, top = 10)
# Contributions of variables to PC11
fviz_contrib(res_MVPA.pca, choice = "var", axes = 11, top = 10)
# Contributions of variables to PC12
fviz_contrib(res_MVPA.pca, choice = "var", axes = 12, top = 10)
fviz_pca_var(res_MVPA.pca,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
Now that we have PCs that reduce overlapping variance, we can use them in a series of regressions to predict span, BPRS and accuracy at high load. We will use several methods to select features: stepwise regression, ridge regression, LASSO regression and ElasticNet regression. A brief description of each method follows:
Stepwise Regression: Starts with all predictors and removes them sequentially to minimize the model’s AIC. At each step, it computes what the AIC would be after removing each individual predictor, then drops the one that lowers the AIC most, stopping when no further removal improves the AIC.
Ridge Regression: A type of penalized linear regression (uses the L2 norm - penalizes the sum of the squared coefficients). Shrinks the coefficients of unimportant predictors toward zero, but does not remove them.
LASSO Regression: Another type of penalized linear regression (uses the L1 norm - penalizes the sum of the absolute values of the coefficients). Forces the coefficients of unimportant predictors to exactly zero, creating a sparse model.
ElasticNet Regression: Combines the L1- and L2-norm penalties, so it shrinks some coefficients toward zero and sets others to exactly zero.
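In glmnet terms (the engine caret uses below), the three penalized methods differ only in the mixing parameter alpha. A minimal standalone sketch on simulated data - the `X` and `y` here are hypothetical, not the study data:

```r
library(glmnet)

set.seed(123)
X <- matrix(rnorm(100 * 20), nrow = 100)  # 100 subjects x 20 predictors
y <- rnorm(100)

# alpha controls the L1/L2 mix: 0 = ridge, 1 = LASSO, in between = ElasticNet
fit_ridge   <- cv.glmnet(X, y, alpha = 0)    # shrinks all coefficients, drops none
fit_lasso   <- cv.glmnet(X, y, alpha = 1)    # zeroes out unimportant coefficients
fit_elastic <- cv.glmnet(X, y, alpha = 0.5)  # shrinks some, zeroes others

# Coefficients at the cross-validated lambda
coef(fit_lasso, s = "lambda.min")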
reg_data <- data.frame(span = data_for_reg$span,
high_acc = data_for_reg$high_acc,
BPRS = data_for_reg$BPRS,
res.pca[["x"]][,1:3],
res_sim.pca[["x"]][,1:5],
res_MVPA.pca[["x"]][,1:12])
colnames(reg_data)[4:6] <- paste("PC",c(1:3),"_univ",sep="")
colnames(reg_data)[7:11] <- paste("PC",c(1:5),"_sim",sep="")
colnames(reg_data)[12:23] <- paste("PC",c(1:12),"_MVPA",sep="")
set.seed(123)
training.samples_span <- reg_data$span %>%
createDataPartition(p = 0.8, list = FALSE)
train.data_span <- reg_data[training.samples_span, c(1, 4:23)]
test.data_span <- reg_data[-training.samples_span, c(1, 4:23)]
# standardize, because that is important for ridge regression
train.data_span <- sapply(train.data_span, scale)
test.data_span <- sapply(test.data_span, scale)
test.data_span <- data.frame(test.data_span)
lambda <- 10^seq(-3, 3, length = 100)
While the final stepwise model is significant overall, none of the individual predictors reaches significance on its own (PC1_sim comes closest, at p = .051):
step.span <- train(span ~., data = train.data_span,
method = "lmStepAIC",
trControl = trainControl("cv", number = 10),
trace = FALSE
)
# Model accuracy
step.span$results
## parameter RMSE Rsquared MAE RMSESD RsquaredSD MAESD
## 1 none 1.114772 0.1191013 0.8997595 0.2281519 0.1488565 0.1537895
# Final model coefficients
step.span$finalModel
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC1_sim + PC2_sim,
## data = dat)
##
## Coefficients:
## (Intercept) PC1_univ PC2_univ PC1_sim PC2_sim
## -5.612e-17 -1.047e+00 -2.108e-01 -1.192e+00 5.858e-01
# Summary of the model
summary(step.span$finalModel)
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC1_sim + PC2_sim,
## data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.73984 -0.61112 0.05479 0.58409 2.72770
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.612e-17 8.206e-02 0.000 1.0000
## PC1_univ -1.047e+00 6.946e-01 -1.508 0.1339
## PC2_univ -2.108e-01 1.333e-01 -1.582 0.1160
## PC1_sim -1.192e+00 6.059e-01 -1.968 0.0512 .
## PC2_sim 5.858e-01 3.834e-01 1.528 0.1289
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.964 on 133 degrees of freedom
## Multiple R-squared: 0.09793, Adjusted R-squared: 0.0708
## F-statistic: 3.61 on 4 and 133 DF, p-value: 0.007931
The ridge regression predicting span retains all PCs; the largest coefficients belong to PC1_sim, PC1_univ, PC11_MVPA and PC3_MVPA:
ridge.span <- train(
span ~., data = train.data_span, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 0, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(ridge.span$finalModel, ridge.span$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -1.140548e-17
## PC1_univ 3.206506e-02
## PC2_univ -5.304337e-03
## PC3_univ -9.856542e-03
## PC1_sim -3.980457e-02
## PC2_sim -1.032849e-03
## PC3_sim 3.685895e-03
## PC4_sim -5.133846e-03
## PC5_sim 1.200632e-02
## PC1_MVPA 1.056208e-02
## PC2_MVPA -1.567993e-02
## PC3_MVPA -2.350753e-02
## PC4_MVPA -7.769665e-03
## PC5_MVPA -2.238490e-02
## PC6_MVPA -5.646146e-03
## PC7_MVPA -2.060510e-02
## PC8_MVPA 4.535510e-03
## PC9_MVPA 1.075753e-02
## PC10_MVPA -1.173892e-02
## PC11_MVPA -2.379253e-02
## PC12_MVPA 5.852113e-03
# Make predictions
predictions.ridge.span <- ridge.span %>% predict(test.data_span)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.ridge.span, test.data_span$span),
Rsquare = R2(predictions.ridge.span, test.data_span$span)
)
## RMSE Rsquare
## 1 0.9679113 0.04616593
In the LASSO regression, we see that only PC1 from the similarity measures is kept.
lasso.span <- train(
span ~., data = train.data_span, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 1, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(lasso.span$finalModel, lasso.span$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -9.061728e-18
## PC1_univ .
## PC2_univ .
## PC3_univ .
## PC1_sim -1.293783e-01
## PC2_sim .
## PC3_sim .
## PC4_sim .
## PC5_sim .
## PC1_MVPA .
## PC2_MVPA .
## PC3_MVPA .
## PC4_MVPA .
## PC5_MVPA .
## PC6_MVPA .
## PC7_MVPA .
## PC8_MVPA .
## PC9_MVPA .
## PC10_MVPA .
## PC11_MVPA .
## PC12_MVPA .
# Make predictions
predictions.lasso.span <- lasso.span %>% predict(test.data_span)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.lasso.span, test.data_span$span),
Rsquare = R2(predictions.lasso.span, test.data_span$span)
)
## RMSE Rsquare
## 1 0.9621818 0.0557215
ElasticNet likewise keeps only PC1_sim.
elastic.span <- train(
span ~., data = train.data_span, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneLength = 10
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(elastic.span$finalModel, elastic.span$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -9.330919e-18
## PC1_univ .
## PC2_univ .
## PC3_univ .
## PC1_sim -1.035152e-01
## PC2_sim .
## PC3_sim .
## PC4_sim .
## PC5_sim .
## PC1_MVPA .
## PC2_MVPA .
## PC3_MVPA .
## PC4_MVPA .
## PC5_MVPA .
## PC6_MVPA .
## PC7_MVPA .
## PC8_MVPA .
## PC9_MVPA .
## PC10_MVPA .
## PC11_MVPA .
## PC12_MVPA .
# Make predictions
predictions.elastic.span <- elastic.span %>% predict(test.data_span)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.elastic.span, test.data_span$span),
Rsquare = R2(predictions.elastic.span, test.data_span$span)
)
## RMSE Rsquare
## 1 0.9652913 0.0557215
When comparing models, ElasticNet seems to be our best model: it explains the most variance in span on average (~12%), though LASSO has the slightly lower median RMSE across resamples.
models.span <- list(ridge = ridge.span, lasso = lasso.span, elastic = elastic.span, stepwise = step.span)
resamples(models.span) %>% summary( metric = "RMSE")
##
## Call:
## summary.resamples(object = ., metric = "RMSE")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## RMSE
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## ridge 0.7575637 0.8884603 0.9618219 0.9838433 1.096796 1.255309 0
## lasso 0.6602968 0.8675355 0.9555804 0.9627795 1.042191 1.220486 0
## elastic 0.7491220 0.9228491 0.9808799 0.9723596 1.006626 1.149482 0
## stepwise 0.8191630 0.9399231 1.0809590 1.1147722 1.263429 1.481008 0
resamples(models.span) %>% summary( metric = "Rsquared")
##
## Call:
## summary.resamples(object = ., metric = "Rsquared")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## Rsquared
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## ridge 5.864103e-08 0.01708395 0.05183783 0.08074772 0.1246915 0.2662473 0
## lasso 1.304693e-02 0.02608088 0.06735218 0.09679395 0.1384883 0.2905486 0
## elastic 5.693491e-03 0.04103814 0.06829902 0.12227461 0.1865343 0.3923610 0
## stepwise 3.472879e-04 0.03852093 0.04621665 0.11910126 0.1392710 0.3894256 0
set.seed(123)
training.samples_BPRS <- reg_data$BPRS %>%
createDataPartition(p = 0.8, list = FALSE)
train.data_BPRS <- reg_data[training.samples_BPRS, c(3, 4:23)]
test.data_BPRS <- reg_data[-training.samples_BPRS, c(3, 4:23)]
# standardize, because that is important for ridge regression
train.data_BPRS <- sapply(train.data_BPRS, scale)
test.data_BPRS <- sapply(test.data_BPRS, scale)
test.data_BPRS <- data.frame(test.data_BPRS)
lambda <- 10^seq(-3, 3, length = 100)
To predict BPRS, the stepwise model retains a larger set of informative PCs:
step.BPRS <- train(BPRS ~., data = train.data_BPRS,
method = "lmStepAIC",
trControl = trainControl("cv", number = 10),
trace = FALSE
)
# Model accuracy
step.BPRS$results
## parameter RMSE Rsquared MAE RMSESD RsquaredSD MAESD
## 1 none 0.9865915 0.1226856 0.7708434 0.2864079 0.1338461 0.1758591
# Final model coefficients
step.BPRS$finalModel
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC3_univ + PC1_sim +
## PC2_sim + PC5_sim + PC4_MVPA + PC7_MVPA, data = dat)
##
## Coefficients:
## (Intercept) PC1_univ PC2_univ PC3_univ PC1_sim PC2_sim
## 2.816e-17 -1.981e+00 -2.803e-01 3.977e-01 -1.639e+00 1.175e+00
## PC5_sim PC4_MVPA PC7_MVPA
## -2.874e-01 1.938e-01 1.411e-01
# Summary of the model
summary(step.BPRS$finalModel)
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC3_univ + PC1_sim +
## PC2_sim + PC5_sim + PC4_MVPA + PC7_MVPA, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.5640 -0.6657 -0.2147 0.4188 3.9747
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.816e-17 7.944e-02 0.000 1.000000
## PC1_univ -1.981e+00 1.036e+00 -1.912 0.058156 .
## PC2_univ -2.803e-01 1.851e-01 -1.514 0.132394
## PC3_univ 3.977e-01 1.317e-01 3.020 0.003045 **
## PC1_sim -1.639e+00 8.875e-01 -1.846 0.067152 .
## PC2_sim 1.175e+00 5.915e-01 1.986 0.049184 *
## PC5_sim -2.874e-01 8.178e-02 -3.514 0.000609 ***
## PC4_MVPA 1.938e-01 8.378e-02 2.313 0.022301 *
## PC7_MVPA 1.411e-01 8.085e-02 1.745 0.083334 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9332 on 129 degrees of freedom
## Multiple R-squared: 0.18, Adjusted R-squared: 0.1291
## F-statistic: 3.539 on 8 and 129 DF, p-value: 0.000979
Ridge regression keeps all PCs but, similar to the stepwise picks above, gives the largest coefficients to PC5_sim, PC4_MVPA, PC3_sim and PC7_MVPA:
ridge.BPRS <- train(
BPRS ~., data = train.data_BPRS, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 0, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(ridge.BPRS$finalModel, ridge.BPRS$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) 2.381245e-17
## PC1_univ -7.091231e-03
## PC2_univ 1.289033e-02
## PC3_univ 3.171805e-02
## PC1_sim 2.303936e-02
## PC2_sim 2.661347e-02
## PC3_sim -4.095941e-02
## PC4_sim 2.965393e-02
## PC5_sim -6.769529e-02
## PC1_MVPA -8.821387e-03
## PC2_MVPA -1.133490e-02
## PC3_MVPA 8.678659e-03
## PC4_MVPA 5.443479e-02
## PC5_MVPA 1.359917e-03
## PC6_MVPA 1.696657e-04
## PC7_MVPA 3.707152e-02
## PC8_MVPA 1.935436e-02
## PC9_MVPA 2.676909e-02
## PC10_MVPA -1.695855e-02
## PC11_MVPA -1.270204e-02
## PC12_MVPA -1.915871e-02
# Make predictions
predictions.ridge.BPRS <- ridge.BPRS %>% predict(test.data_BPRS)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.ridge.BPRS, test.data_BPRS$BPRS),
Rsquare = R2(predictions.ridge.BPRS, test.data_BPRS$BPRS)
)
## RMSE Rsquare
## 1 0.9798401 0.010319
LASSO keeps a subset similar to the stepwise model (PC3_univ, PC5_sim, PC4_MVPA, PC7_MVPA), plus PC3_sim and PC9_MVPA:
lasso.BPRS <- train(
BPRS ~., data = train.data_BPRS, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 1, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(lasso.BPRS$finalModel, lasso.BPRS$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) 2.324387e-17
## PC1_univ .
## PC2_univ .
## PC3_univ 5.219580e-02
## PC1_sim .
## PC2_sim .
## PC3_sim -5.779022e-02
## PC4_sim .
## PC5_sim -1.604588e-01
## PC1_MVPA .
## PC2_MVPA .
## PC3_MVPA .
## PC4_MVPA 1.210893e-01
## PC5_MVPA .
## PC6_MVPA .
## PC7_MVPA 4.894366e-02
## PC8_MVPA .
## PC9_MVPA 2.044713e-03
## PC10_MVPA .
## PC11_MVPA .
## PC12_MVPA .
# Make predictions
predictions.lasso.BPRS <- lasso.BPRS %>% predict(test.data_BPRS)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.lasso.BPRS, test.data_BPRS$BPRS),
Rsquare = R2(predictions.lasso.BPRS, test.data_BPRS$BPRS)
)
## RMSE Rsquare
## 1 1.012711 0.0001492193
ElasticNet ends up mostly the same as LASSO, but adds in two additional predictors (PC2_sim and PC4_sim).
elastic.BPRS <- train(
BPRS ~., data = train.data_BPRS, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneLength = 10
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(elastic.BPRS$finalModel, elastic.BPRS$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) 2.393051e-17
## PC1_univ .
## PC2_univ .
## PC3_univ 4.076312e-02
## PC1_sim .
## PC2_sim 7.017747e-04
## PC3_sim -6.079961e-02
## PC4_sim 1.138820e-02
## PC5_sim -1.472034e-01
## PC1_MVPA .
## PC2_MVPA .
## PC3_MVPA .
## PC4_MVPA 1.130421e-01
## PC5_MVPA .
## PC6_MVPA .
## PC7_MVPA 4.917944e-02
## PC8_MVPA .
## PC9_MVPA 7.619003e-03
## PC10_MVPA .
## PC11_MVPA .
## PC12_MVPA .
# Make predictions
predictions.elastic.BPRS <- elastic.BPRS %>% predict(test.data_BPRS)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.elastic.BPRS, test.data_BPRS$BPRS),
Rsquare = R2(predictions.elastic.BPRS, test.data_BPRS$BPRS)
)
## RMSE Rsquare
## 1 1.009694 0.0001245619
These models explain less variance than the span models - median R² is only about 3-8%. LASSO seems to be doing the best, with the second-highest explained variance and the lowest median RMSE.
models.BPRS <- list(ridge = ridge.BPRS, lasso = lasso.BPRS, elastic = elastic.BPRS, stepwise = step.BPRS)
resamples(models.BPRS) %>% summary( metric = "RMSE")
##
## Call:
## summary.resamples(object = ., metric = "RMSE")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## RMSE
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## ridge 0.6751430 0.8810408 0.9330348 0.9589276 0.9761646 1.495356 0
## lasso 0.6129708 0.7148376 0.8631748 0.9351477 0.9829447 1.609346 0
## elastic 0.6735014 0.8813779 0.9185624 0.9693396 1.0361628 1.452554 0
## stepwise 0.6356485 0.8259940 0.9710757 0.9865915 1.0797458 1.625920 0
resamples(models.BPRS) %>% summary( metric = "Rsquared")
##
## Call:
## summary.resamples(object = ., metric = "Rsquared")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## Rsquared
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## ridge 5.557396e-03 0.017963188 0.03482652 0.06736287 0.1152648 0.1801430
## lasso 2.490643e-04 0.004153578 0.06573379 0.09669120 0.1133624 0.3342253
## elastic 2.463748e-06 0.006433070 0.04153660 0.10106518 0.1461022 0.4257668
## stepwise 2.808257e-03 0.026042606 0.07563631 0.12268561 0.2037714 0.3978196
## NA's
## ridge 0
## lasso 0
## elastic 0
## stepwise 0
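The resampling distributions can also be compared graphically; caret provides lattice-based plots for resamples objects. A minimal sketch, assuming the models.BPRS list defined above:
# Box-and-whisker comparison of the cross-validated metrics across models
bwplot(resamples(models.BPRS), metric = c("RMSE", "Rsquared"))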
set.seed(123)
# 80/20 train/test split on high-load accuracy
training.samples_high_acc <- reg_data$high_acc %>%
createDataPartition(p = 0.8, list = FALSE)
train.data_high_acc <- reg_data[training.samples_high_acc, c(2, 4:23)]
test.data_high_acc <- reg_data[-training.samples_high_acc, c(2, 4:23)]
# Standardize every column; note each partition is scaled with its own mean/SD
train.data_high_acc <- sapply(train.data_high_acc, scale)
test.data_high_acc <- sapply(test.data_high_acc, scale)
test.data_high_acc <- data.frame(test.data_high_acc)
# Candidate penalty values for the regularized models
lambda <- 10^seq(-3, 3, length = 100)
step.high_acc <- train(high_acc ~., data = train.data_high_acc,
method = "lmStepAIC",
trControl = trainControl("cv", number = 10),
trace = FALSE
)
# Model accuracy
step.high_acc$results
## parameter RMSE Rsquared MAE RMSESD RsquaredSD MAESD
## 1 none 0.8507368 0.301811 0.6756329 0.09371362 0.1598043 0.06510151
# Final model coefficients
step.high_acc$finalModel
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC3_univ + PC1_sim +
## PC2_sim + PC3_sim + PC4_sim + PC2_MVPA + PC4_MVPA + PC6_MVPA +
## PC10_MVPA + PC11_MVPA, data = dat)
##
## Coefficients:
## (Intercept) PC1_univ PC2_univ PC3_univ PC1_sim PC2_sim
## -5.754e-16 -1.574e+00 -2.996e-01 8.089e-01 -1.785e+00 9.608e-01
## PC3_sim PC4_sim PC2_MVPA PC4_MVPA PC6_MVPA PC10_MVPA
## 5.293e-01 -6.723e-01 1.863e-01 -1.811e-01 1.204e-01 -2.125e-01
## PC11_MVPA
## -1.884e-01
# Summary of the model
summary(step.high_acc$finalModel)
##
## Call:
## lm(formula = .outcome ~ PC1_univ + PC2_univ + PC3_univ + PC1_sim +
## PC2_sim + PC3_sim + PC4_sim + PC2_MVPA + PC4_MVPA + PC6_MVPA +
## PC10_MVPA + PC11_MVPA, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.2515 -0.4910 0.0972 0.4933 2.5137
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.754e-16 6.792e-02 0.000 1.00000
## PC1_univ -1.574e+00 9.369e-01 -1.680 0.09542 .
## PC2_univ -2.996e-01 1.768e-01 -1.695 0.09256 .
## PC3_univ 8.089e-01 2.450e-01 3.301 0.00126 **
## PC1_sim -1.785e+00 8.088e-01 -2.207 0.02914 *
## PC2_sim 9.608e-01 5.305e-01 1.811 0.07251 .
## PC3_sim 5.293e-01 1.243e-01 4.257 4.06e-05 ***
## PC4_sim -6.723e-01 2.069e-01 -3.249 0.00149 **
## PC2_MVPA 1.863e-01 7.263e-02 2.566 0.01149 *
## PC4_MVPA -1.811e-01 7.040e-02 -2.573 0.01127 *
## PC6_MVPA 1.204e-01 7.004e-02 1.719 0.08816 .
## PC10_MVPA -2.125e-01 7.041e-02 -3.018 0.00309 **
## PC11_MVPA -1.884e-01 7.262e-02 -2.594 0.01062 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.795 on 124 degrees of freedom
## Multiple R-squared: 0.4238, Adjusted R-squared: 0.368
## F-statistic: 7.6 on 12 and 124 DF, p-value: 2.112e-10
This stepwise model is harder to interpret: twelve predictors are retained, and most have coefficients of broadly similar magnitude, although PC1_univ and PC1_sim carry the largest weights.
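One way to get a cleaner ranking than eyeballing raw coefficients is caret's variable-importance summary, which for lm-based models scores predictors by the magnitude of their t-statistics (a sketch; assumes the fitted step.high_acc object from above):
# Rank the retained predictors by importance
varImp(step.high_acc)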
ridge.high_acc <- train(
high_acc ~., data = train.data_high_acc, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 0, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(ridge.high_acc$finalModel, ridge.high_acc$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -4.454290e-16
## PC1_univ 1.066484e-01
## PC2_univ -8.266385e-02
## PC3_univ 2.668530e-01
## PC1_sim -3.147307e-01
## PC2_sim -3.154310e-02
## PC3_sim 3.345665e-01
## PC4_sim -3.581330e-01
## PC5_sim 5.479195e-02
## PC1_MVPA 1.219934e-02
## PC2_MVPA 1.720979e-01
## PC3_MVPA -5.791474e-02
## PC4_MVPA -1.731201e-01
## PC5_MVPA 2.582833e-02
## PC6_MVPA 1.030284e-01
## PC7_MVPA -2.837541e-02
## PC8_MVPA 9.034467e-02
## PC9_MVPA 5.330388e-02
## PC10_MVPA -2.075835e-01
## PC11_MVPA -1.799489e-01
## PC12_MVPA 3.979497e-02
# Make predictions
predictions.ridge.high_acc <- ridge.high_acc %>% predict(test.data_high_acc)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.ridge.high_acc, test.data_high_acc$high_acc),
Rsquare = R2(predictions.ridge.high_acc, test.data_high_acc$high_acc)
)
## RMSE Rsquare
## 1 1.004651 0.08762195
lasso.high_acc <- train(
high_acc ~., data = train.data_high_acc, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneGrid = expand.grid(alpha = 1, lambda = lambda)
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(lasso.high_acc$finalModel, lasso.high_acc$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -4.374066e-16
## PC1_univ .
## PC2_univ -1.120619e-01
## PC3_univ .
## PC1_sim -3.534061e-01
## PC2_sim .
## PC3_sim 1.726095e-01
## PC4_sim -1.031671e-01
## PC5_sim 5.166475e-02
## PC1_MVPA .
## PC2_MVPA 1.235859e-01
## PC3_MVPA -4.365357e-02
## PC4_MVPA -1.571936e-01
## PC5_MVPA .
## PC6_MVPA 6.109042e-02
## PC7_MVPA .
## PC8_MVPA 6.435941e-02
## PC9_MVPA 1.477527e-02
## PC10_MVPA -1.760170e-01
## PC11_MVPA -1.606009e-01
## PC12_MVPA .
# Make predictions
predictions.lasso.high_acc <- lasso.high_acc %>% predict(test.data_high_acc)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.lasso.high_acc, test.data_high_acc$high_acc),
Rsquare = R2(predictions.lasso.high_acc, test.data_high_acc$high_acc)
)
## RMSE Rsquare
## 1 0.9621296 0.1035457
When we look at ElasticNet, all coefficients are retained. The largest appear to be PC1_sim, PC1_univ, PC3_univ, PC2_sim, and PC4_sim:
elastic.high_acc <- train(
high_acc ~., data = train.data_high_acc, method = "glmnet",
trControl = trainControl("cv", number = 10),
tuneLength = 10
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
# Model coefficients
coef(elastic.high_acc$finalModel, elastic.high_acc$bestTune$lambda)
## 21 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) -4.318103e-16
## PC1_univ -1.192352e+00
## PC2_univ -2.358802e-01
## PC3_univ 7.947027e-01
## PC1_sim -1.469500e+00
## PC2_sim 7.225520e-01
## PC3_sim 5.497432e-01
## PC4_sim -7.042848e-01
## PC5_sim -1.201486e-02
## PC1_MVPA 2.350144e-02
## PC2_MVPA 1.920033e-01
## PC3_MVPA -4.061700e-02
## PC4_MVPA -1.757091e-01
## PC5_MVPA 4.583740e-02
## PC6_MVPA 1.182798e-01
## PC7_MVPA -2.869839e-02
## PC8_MVPA 8.065382e-02
## PC9_MVPA 5.916169e-02
## PC10_MVPA -2.184148e-01
## PC11_MVPA -1.867733e-01
## PC12_MVPA 6.111970e-02
# Make predictions
predictions.elastic.high_acc <- elastic.high_acc %>% predict(test.data_high_acc)
# Model prediction performance
data.frame(
RMSE = RMSE(predictions.elastic.high_acc, test.data_high_acc$high_acc),
Rsquare = R2(predictions.elastic.high_acc, test.data_high_acc$high_acc)
)
## RMSE Rsquare
## 1 1.04074 0.1044495
These models explain much more variance than those for either span or BPRS: roughly 25-40%. ElasticNet does best, with about 40% of the variance explained (median cross-validated R-squared) and the second-lowest median RMSE.
models.high_acc <- list(ridge = ridge.high_acc, lasso = lasso.high_acc, elastic = elastic.high_acc, stepwise = step.high_acc)
resamples(models.high_acc) %>% summary( metric = "RMSE")
##
## Call:
## summary.resamples(object = ., metric = "RMSE")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## RMSE
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## ridge 0.6308345 0.7522487 0.9101837 0.8724614 0.9800931 1.1061641 0
## lasso 0.6270153 0.7610878 0.9142889 0.8710046 0.9671869 1.0394208 0
## elastic 0.6392307 0.7129727 0.8577051 0.8401602 0.9526558 1.0843036 0
## stepwise 0.7126859 0.7960984 0.8238317 0.8507368 0.9329204 0.9885754 0
resamples(models.high_acc) %>% summary( metric = "Rsquared")
##
## Call:
## summary.resamples(object = ., metric = "Rsquared")
##
## Models: ridge, lasso, elastic, stepwise
## Number of resamples: 10
##
## Rsquared
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## ridge 0.005128648 0.2032264 0.2563208 0.2755459 0.4014508 0.5220665 0
## lasso 0.011479347 0.1572872 0.3046061 0.3019020 0.4494553 0.5423697 0
## elastic 0.147183003 0.1779950 0.3934752 0.3729224 0.4585395 0.7575379 0
## stepwise 0.071274746 0.2263239 0.2603046 0.3018110 0.3220174 0.6640016 0
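If only the median statistics are wanted, they can be pulled out of the summary object programmatically rather than read off the printed table (a sketch assuming the models.high_acc list above):
# Extract the median column of the cross-validated R-squared summary
summary(resamples(models.high_acc))$statistics$Rsquared[, "Median"]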